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15A General Linear Least Squares 

An immediate generalization of §15.2 is to fit a set of data points (it, y%) to a 
model that is not just a linear combination of 1 and x (namely a + bx) t but rather a 
linear combination of any M specified functions of x. For example, the functions 
could be 1, x, x 2 , ... , x M ~ l , in which case their general linear combination, a: 3 J g> g> 

y (x) = ai + a 2 x + a 3 x 2 + • • • + a M * M ~ l (15.4.1) 1 1 Iff 

is a polynomial of degree M — 1. Or, the functions could be sines and cosines, in b § -g<§ 
which case their general linear combination is a harmonic series. 8 s$ £ 3 

The general form of this kind of model is 1 §. f 2 1 

M |.<o 3* <o § 

y(x)^a k X k (x) ■ . (15.4.2) |ff*§. 



N 

* 2 = E 



where Xi(i),...,1mW are arbitrary fixed functions of x, called the basis ?o||m 

Junctions. ^ 0 =5 

Note that the functions Xfe(x) can be wildly nonlinear functions of x. In this -5 1 1 = « 

discussion "linear" refers only to the model's dependence on its parameters a % 2 5" I P 

For these Unear models we generalize the discussion of the previous section 3 f § % 5 

by defining a merit function « > 



(15.4.3) . f|SlS 



As before, cr* is the measurement error (standard deviation) of the ith data point, a||o 3 

presumed to be known. If the measurement errors are not known, they may all (as | g i 0 3 

discussed at the end of §15.1) be set to the constant value a — 1. J H -n § 

Once again, we will pick as best parameters those that minimiz e x 2 - There are 3 ° 1 si 1 

several different techniques available for finding this minimum. Two are particularly | 5 s. g g 

useful, and we will discuss both in this section. To introduce them and elucidate £ | 8 go 

their relationship, we need some notation. 2 S ? £ gj 

Let A be a matrix whose TV x M components are constructed from the M 3 ^ ? c 9 

basis functions evaluated at the TV abscissas a:*, and from the TV measurement errors | 8 3 | § 

cri, by the prescription <§ 8 1. 8 & 

\r / \ O o~ 2. 2) o 



1 c/> 

r«f - 
< co 



,4,, = ^ (15.4.4) 

G i c g o ™ 

The matrix A is called the design matrix of the fitting problem. Notice that in general £ 8 jj 

A has more rows than columns, TV >M, since there must be more data points than 

model parameters to be solved for. (You can fit a straight line to two points, but not a 

very meaningful quintic !) The design matrix is shown schematically in Figure 1 5.4. 1 . | % a 

Also define a vector b of length TV by j[ 1 1 

bi = ¥i (15.4.5) *f 

and denote the M vector whose components are the parameters to be fitted, 
ai,...,a M , by a. 
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Figure 15.4.1. Design matrix for the least-squares fit of a linear combination of M basis functions to N 
data points. The matrix elements involve the basis functions evaluated at the values of the independent 
variable at which measurements are made, and the standard deviations of the measured dependent variable. 
The measured values of the dependent variable do not enter the design matrix. 

Solution by Use of the Normal Equations 

The minimum of (15.4.3) occurs where the derivative of x 2 with respect to all 
M parameters a k vanishes. Specializing equation (15.1.7) to the case of the model 
(15.4.2), this condition yields the M equations 



N 



' a- 



M 



X k (xi) & = l,...,Af (15.4.6) 



Interchanging the order of summations, we can write (15.4.6) as the matrix equation 

M 

Z>jy*i=A (15.4.7) 

where 



= ^ Xj{xi)X k {xi) 



k3 'h * 

an M x M matrix, and 

N 



or equivalently [a] = A r • A (15.4.8) 



ViX k (xi) 



or equivalently (/?] = A r ■ b (15.4.9) 
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a vector of length M. 

The equations (15.4.6) or (15.4.7) are called the normal equations of the least- 
squares problem. They can be solved for the vector of parameters a by the standard 
methods of Chapter 2, notably LU decomposition and backsubstitution, Choleksy 
decomposition, or Gauss- Jordan elimination. In matrix form, the normal equations 
can be written as either 



H-a = [£] or as (A T A) • a = A T 



(15.4.10) 



The inverse matrix C ik = [a]^ is closely related to the probable (or, more 
precisely, standard) uncertainties of the estimated parameters a. To estimate these 
uncertainties, consider that 



M 



M 



fc=l 



N 



yiX k (xj) 



(15.4.11) 



and that the variance associated with the estimate aj can be found as in (15.2.7) from 



Note that otj k is independent of so that 



M. 



Consequently, we find that 

M M 



N 

£ 

t=l 



Xk(xi)Xi(xi) 



(15.4.12) 



(15.4.13) 



(15.4.14) 



The final term in brackets is just the matrix [a]. Since this is the matrix inverse of 
[C], (15.414) reduces immediately to 



a 2 ( aj ) = C j: 



(15.4.15) 



In other words, the diagonal elements of [C] are the variances (squared 
uncertainties) of the fitted parameters a. It should not surprise you to learn that the 
off-diagonal elements C jk are the covariances between aj and a k (cf. 15.2.10); but 
we shall defer discussion of these to §15.6. 

We will now give a routine that implements the above formulas for the general 
linear least-squares problem, by the method of normal equations. Since we wish to 
compute not only the solution vector a but also the covariance matrix [C], it is most 
convenient to use Gauss- Jordan elimination (routine gauss j of §2.1) to perform the 
linear algebra. The operation count, in this application, is no larger than that for LU 
decomposition. If you have no need for the covariance matrix, however, you can 
save a factor of 3 on the linear algebra by switching to LU decomposition, without 
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computation of the matrix inverse. In theory, since A T • A is positive definite, 
Cholesky decomposition is the most efficient way to solve the normal equations. 
However, in practice most of the computing time is spent in looping over the data 
to form the equations, and Gauss- Jordan is quite adequate. 

We need to warn you that the solution of a least-squares problem directly from 
the normal equations is rather susceptible to roundoff error. An alternative, and 
preferred, technique involves QR decomposition (§2.10, §11.3, and §11.6) of the 
design matrix A. This is essentially what we did at the end of §15.2 for fitting data to 
a straight line, but without invoking all the machinery of QR to derive the necessary 
formulas. Later in this section, we will discuss other difficulties in the least-squares 
problem, for which the cure is singular value decomposition (S VD), of which we give 
an implementation. It turns out that SVD also fixes the roundoff problem, so it is our 
recommended technique for all but "easy" least-squares problems. It is for these easy 
problems that the following routine, which solves the normal equations, is intended. 

The routine below introduces one bookkeeping trick that is quite useful in 
practical work. Frequently it is a matter of "art" to decide which parameters a k 
in a model should be fit from the data set, and which should be held constant at 
fixed values, for example values predicted by a theory or measured in a previous 
experiment. One wants, therefore, to have a convenient means for "freezing" 
and "unfreezing" the parameters a k . In the following routine the total number of 
parameters a k is denoted ma (called M above). As input to the routine, you supply 
an array ia[l . .ma] , whose components are either zero or nonzero (e.g., 1). Zeros 
indicate that you want the corresponding elements of the parameter vector a [1 . . ma] 
to be held fixed at their input values. Nonzeros indicate parameters that should be 
fitted for. On output, any frozen parameters will have their variances, and all their 
covariances, set to zero in the covariance matrix. 

#include "nrutil.h" 

void If it (float xQ , f loat yd . float sigD , int ndat, float aQ , int iad , 

int ma, float **covar. float *chisq, void (*funcs) (float , float int)) 
Given a set of data points x[l..ndat], y[l..ndat] with individual standard deviations 
sig[l. .ndat] , use x 2 minimization to fit for some or all of the coefficients a[l. .ma] of 
a function that depends linearly on a, y = £\ a* x afunci(x). The input array ia[l. .ma] 
indicates by nonzero entries those components of a that should be fitted for, and by zero entries 
those components that should be held fixed at their input values. The program returns values 
for a[l . .ma] , x 2 = chisq, and the covariance matrix covar [1 . .ma] fl • -ma] . (Parameters 
held fixed will return zero covariances.) The user supplies a routine f lines (x,a± unc ,ma) that 
returns the ma basts functions evaluated at x = x in the array afunc[l. .ma] . 
{ 

void covsrt(float **covar, int ma, int iad, int raf it) ; 

void gauss j (float **a, int n, float **b, int m) ; 

int i, j .k.l^.mf it=0; 

float ym.wt ,sum,sig2i,**beta,*af unc; 

beta=matrix(l,ma,l,l) ; 
af unc=vector ( 1 , ma) ; 
for (j=l; j<=ma; j++) 

if (ia[j]) mfit++; 
if (mfit = 0) nrerrorClf it: no parameters to be fitted"); 
for (j=l; j<=mf it; j++) { Initialize the (symmetric) matrix. 

for (k=l;k<=mfit;k++) covar[j] [k]=0.0; 

betaCj] Cl]=0.0; 

> 

for (i«l ; i<=ndat ; i++) { Loop over data to accumulate coefficients of 

the normal equations. 
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(*f tuxes) (xCil ,afunc,ma) ; 
ym=y [i] ; 

if (mf it < ma) < Subtract off dependences on known pieces 

for (j»l; j<=ma; j++) of the fitting function, 

if (liaCjl) ym — a[j3*af uncCj] ; 

> 

sig2i=l . 0/SQR(sigCi] ) ; 
for (j=0,l=l;K=ma;l++) { 

if (iaCU) < 3 
wt=afunc£l3*sig2i; S^lll 
for (j++,k-0.m-l;m<-l;xa++) § i"^"® 

if (iaCm]) covar [ j 3 [++k] wt*afunc[m] ; b W l^iS 

betaCjHU += ym*vt; b~«S 

x } i §t«i 

} s^^si 

for (j=2; j<=mf it; j++) Fill in above the diagonal from symmetry. ItI^ 

for Ck-l;k<j;k++) go f o S 

covarCk] Cj>covar Cj] Ck] ; 9jg | » £ 

gauss j (covar ,mf it , beta, 1) ; Matrix solution. ro o = ? 3 

for (j=0 , 1=1 ; K=*ma; 1++) mdS^S 

if (iaClj) aCl3=betaC++j3 Cl3 ; Partition solution to appropriate coefficients § c m 

*chisq=0.0; a. z § 3 §• — 

for (i°l;i<=ndat;i++) { Evaluate x 2 of the fit. £ ® §r S 2 

(*funcs)(x[i3»af unc.ma); > 8 S j? P 

for (sum=0.0, j<=ma; j++) sum +- aCj] *af uncCj] ; 3 -| S TJ 5? 



♦chisq += SQR((yCi3-sum)/sig[i3); ^ ||| | 



m 

covsrt (covar ,ma,ia,mf it) ; Sort covariance matrix to true order of fitting 2- » g 3 o 

f ree_vector (af unc , 1 , ma) ; coefficients. 2 ;o *g "n 

free_matrix(beta,l,ma,l,l); ° .# 5" 3 O 



3 O 2. -1! 



That last call to a function covsrt is only for the purpose of spreading the ~ ^-o X g 

covariances back into the full ma x ma covariance matrix, in the proper rows and ^ o I - ? 

columns and with zero variances and covariances set for variables which were g. f » § g 

held frozen. § c | <| o 

The function covsrt is as follows. 3 1? 

#define SWAP(a.b) <swap»(a) ; (a)=(b) ; (b)=swap;> 1[ §.-§ | § 

void covsrt(float **covar, int ma, int iafj , int mfit) g o- g -j, — 

Expand in storage the covariance matrix covar, so as to take into account parameters that are <3 % £T g <? 

being held fixed. (For the latter, return zero covariances.) ? o o ® "~ 



int i,j,k; g" g 

float swap; ^ O o » 

for (i=mf it+l;i<=ma;i++) > < co 

for (J-l; j<=i;j++) covar[i] [jj=covarCjl Ci]=0.0; | S 2. 

k=mfit; f|| 
for (j=ma;j>=l; j— ) { ~ » 3; 

if (iaCj]) < ®3 
for (i=l;i<=ma;i++) SWAP (covar Ci] Ck] , covar [i] Cj] ) 
for (i=l;i<=ma;i++) SWAP (covar Ck] Ci] , covar Cj] Ci] ) 
k— ; 
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where the Wi are, as in §2.6, the singular values calculated by svdcmp. 

Equation (15.4.17) says that the fitted parameters a are linear combinations of 
the columns of V, with coefficients obtained by forming dot products of the columns 



3 



Solution by Use of Singular Value Decomposition 

In some applications, the normal equations are perfectly adequate for linear 
least-squares problems. However, in many cases the normal equations are very close 
to singular. A zero pivot element may be encountered during the solution of the 
linear equations (e.g;, in gauss j), in which case you get no solution at all. Or a 
very small pivot may occur, in which case you typically get fitted parameters a * 

with very large magnitudes that are delicately (and unstably) balanced to cancel out I s | §]f 

almost precisely when the fitted function is evaluated. % =s f J5"S 

Why does this commonly occur? The reason is that, more often than experi- g ~ J 

menters would like to admit, data do not clearly distinguish between two or more of | § 1 1 1 

the basis functions provided. If two such functions, or two different combinations 8 J" § § § 

of functions, happen to fit the data about equally well — or equally badly — then r ^ i* ^ S 

the matrix [a], unable to distinguish between them, neady folds up its tent and §g f og 

becomes singular. There is a certain mathematical irony in the fact that least-squares 3 'T jr I- £ 

problems are both overdeterrnined (number of data points greater than number of £ § | Q 

parameters) and underdetermined (ambiguous combinations of parameters exist); « g o c m 

but that is how it frequently is. The ambiguities can be extremely hard to notice o | » f z 

a priori in complicated problems. J 8 S jf ^ 

Enter singular value decomposition (S VD). This would be a good time for you 1 1 -I s m 

to review the material in §2.6, which we will not repeat here. In the case of an £ 5 

overdeterrnined system, SVD produces a solution that is the best approximation in J- £ 8 3 o 

the least-squares sense, cf. equation (2.6.10). That is exactly what we want. In the q J[*J | co 

case of an underdetermined system, SVD produces a solution whose values (for us, ^ ~ tn ~ 



the ajb's) are smallest in the least-squares sense, cf. equation (2.6.8). That is also % 

. what we want: When some combination of basis functions is irrelevant to the fit, that I § | <| g 

combination will be driven down to a small, innocuous, value, rather than pushed 5 s" 5 o § 

up to delicately canceling infinities. | * a I © 3 

In terms of the design matrix A (equation 15.4.4) and the vector b (equation !||?^ 

15.4.5), minimization of x 2 in (15.4.3) can be written as 1 1 * 



find a that minimizes x 2 = I A • a - b|* (15.4.16) § 5 9 



Comparing to equation (2.6.9), we see that this is precisely the problem that routines ? % = | § 

svdcmp and svbksb are designed to solve. The solution, which is given by equation J p. § 3 fj 

(2.6.12), can be rewritten as follows: If U and V enter the SVD decomposition | o o g ~ 

of A according to equation (2.6.1), as computed by svdcmp, then let the vectors |o| g 

U(i) i = 1, . . . , M denote the columns of U (each one a vector of length N); and ?o|| 
let the vectors V(^; i = 1, . . . , M denote the columns of V (each one a vector 5- * 

of length M). Then the solution (2.6.12) of the least-squares problem (15.4.16) § | a 

can be written as S I I 



* 2 
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of U with the weighted data vector (15.4.5). Though it is beyond our scope to prove 
here, it turns out that the standard (loosely, "probable") errors in the fitted parameters 
are also linear combinations of the columns of V. In fact, equation (15.4.17) can 
be written in a form displaying these errors as 



a = 



*M = £ ^(O]? - £ g) (15.4.19) 



± — V (x) ± • • . ± — V (Ar) (15.4.18) ^Sjoco 

^i K ) w M {M) v ; fs.ii i 

_ ^ v/ „ 1!t ^ ^UOLlUg la^L lO L11UL, ^ 

decomposed in this fashion, the standard deviations are all mutually independent g^o^g- 

(uncorrelated). Therefore they can be added together in root-mean-square fashion. f 1 1 § 3 

What is going on is that the vectors V^) are the principal axes of the error ellipsoid 8 <t £ | § 

of the fitted parameters a (see §15.6). 7f I 

It follows that the variance in the estimate of a parameter a,j is given by § § § j? > 

p a- ID 

|i||5 

whose result should be identical with (15.4.14). As before, you should not be >§ ^ 

surprised at the formula for the covariances, here given without proof, H-S 3 m 

Cov(^,a fc )=X:(^). (15.4.20) f j| j g 

We introduced this subsection by noting that the normal equations can fail ^o^2 

by encountering a zero pivot. We have not yet, however, mentioned how SVD = £ J J o 

overcomes this problem. The answer is: If any singular value Wi is zero, its a f § — § 

reciprocal in equation (15.4.18) should be set to zero, not infinity. (Compare the 1 1 1 S § 

discussion preceding equation 2.6.7.) This corresponds to adding to the fitted I ? § £ © 

parameters a a zero multiple, rather than some random large multiple, of any linear | i -n 8 ^ 

combination of basis functions that are degenerate in the fit. It is a good thing to do! ® » ^ ^ z 

Moreover, if a singular value Wi is nonzero but very small, you should also I Jg i « 

define its reciprocal to be zero, since its apparent value is probably an artifact of J*"®! §" I 

roundoff error, not a meaningful number. A plausible answer to the question "how § i f 5 § 

small is small?" is to edit in this fashion all singular values whose ratio to the |.fS§S 

largest singular value is less than N times the machine precision e. (You might s ° £ S 
argue for y/N, or a constant, instead of N as the multiple; that starts getting into 
hardware-dependent questions.) 

There is another reason for editing even additional singular values, ones large >"< <§ 
enough that roundoff error is not a question. Singular value decomposition allows 
you to identify linear combinations of variables that just happen not to contribute 
much to reducing the x 2 of your data set. Editing these can sometimes reduce the 
probable error on your coefficients quite significandy, while increasing the minimum 
X 2 only negligibly. We will learn more about identifying and treating such cases 
in §15.6. In the following routine, the point at which this kind of editing would 
occur is indicated. 

Generally speaking, we recommend that you always use SVD techniques instead 
of using the normal equations. S VD's only significant disadvantage is that it requires 



no w 
i?5 3 



3 co © 

* i § 

• S2.2 
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basis functions (and fitted parameters). If you care only about the values a of the 
fitted parameters, then u , v , w contain no useful information on output. If you want 
probable errors for the fitted parameters, read on. 3 || 1 1 

# include "nrutil.h" I^l^m 



an extra array of size N x M to store the whole design matrix. This storage 
is overwritten by the matrix U. Storage is also required for the M x M matrix 
V, but this is instead of the same-sized matrix for the coefficients of the normal 
equations. SVD can be significantly slower than solving the normal equations; 
however, its great advantage, that it (theoretically) cannot fail, more than makes up 
for the speed disadvantage. 

In the routine that follows, the matrices u , v and the vector w are input as Jf S ? § » 
working space. The logical dimensions of the problem are ndata data points by ma ? I: « &• y 

= • — o<s 

r> m u c; o 
£> CQ -»■ =T 

) 3 

^ to C 
EL 13 o" fg 

T* — 5" °" 33 

#define TOL l.Oe-5 Default value for single precision and vari- gjff^o 

ables scaled to order unity. g j| § § £ 

void svdf it (float xQ , float yd, float sigO , int ndata, float aG , int ma, S o = 5 33 

float **u, float **v, float w[] , float *chisq, -ii s> & Q- g 

void (*funcs) (float, float □, int)) Sflcm 

Given a set of data points x[l. .ndata] ,y[l. .ndata] with individual standard deviations z § 3 f* 52 

. sig[l. .ndata] , use x 2 minimization to determine the coefficients a[l..ma] of the fit- ^ ® % ® z . 
ting function y = J2i x afunci(x). Here we solve the fitting equations using singular . J 8 S ^ " 

value decomposition of the ndata by ma matrix, as in §2.6. Arrays u[l. .ndata] [1 . .ma] , 3 -§ § -o x 

. v[l..ma] [1. .ma], and w[l. .ma] provide workspace on input; on output they define the S. 5. "g 3 m 

singular value decomposition, and can be used to obtain the covariance matrix. The pro- ™ © " 3) 

. . gram returns values for the ma fit parameters a, and x 2 . chisq. The user supplies a routine 2. % g 3 q 

funcs(x,afunc,ma) that returns the ma basis functions evaluated at x = x in the array ^ 3 71 

afuncCl. .ma]. M?3n 

. / g -o Z " m 

• 3 3 3" V Z 

void svbksb(float **u, float vQ, float **v, int m, int n, float b[], 9- 3:~-§ d 

n— *a>; 3 f 83-8 

void svdcmp(float **a, int m, int n, float float **v) ; — ^-o j? 0 

- . int 1,i; ? o S <"> 2 



float wmax, tmp, thresh, sum, *b,*afunc; 



2 3. 3 to r. 
o -* a» ^ 

b=*vector(l , ndata) ; §. c w to O 

afunc=vector(l,ma) ; ®l-n*>5j 
for (i=l;i<*ndata;i++) { Accumulate coefficients of the fitting ma- ©H'a.^'ro 

(♦funcs) (x[i] ,afunc,ma) ; trix. o 2. ^ z 



tmp-1 . 0/sigCi] ; |® j3 « 

for (j=l; j<=ma;j++) u[i] [j] =»af unc Cj] *tmp; 

b[i]-yCi]*tmp; «g S o. B § 

> bo 2jo 

s vdcmp(u, ndata, ma, w,v) ; Singular value decomposition. ~. gT J 3 

wmax=0.0; Edit the singular values, given TQL from the |2 ° 2 

for (j=l; j<=ma; j-M-) #define statement, between here ... sgao 

if (w[jj > umax) wmax=w[j]; z £ Q ? 

thresh=TOL*wmax; 

for (j=l;j<=ma;j++) ><f ' 

if (wfjj < thresh) w[j]=0.0; ...and here. 3 

svbksb(u,w,v,ndata,ma,b,a) ; S 3 

*chisq=0.0; Evaluate chi-square. ^ §- o 

for (i«l;i<=ndata;i++) < % 5" 

(♦funcs) (x£i] ,afunc,ma) ; ? 

for (sum=0.0, j<=ma; j++) sum += a[j]*afunc[j] ; 

*chisq +— (tmp=(yCil-sum)/sig[i3 , tmp*tmp) ; 

> 

f ree_vector(af unc , 1 ,ma) ; 
f ree.vector (b, 1 , ndata) ; 
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Feeding the matrix v and vector w output by the above program into the 
following short routine, you easily obtain variances and covariances of the fitted 
parameters a. The square roots of the variances are the standard deviations of 
the fitted parameters. The routine straightforwardly implements equation (15.4.20) 
above, with the convention that singular values equal to zero are recognized as 
having been edited out of the fit. 

#include "nmtil.h" 



DOW 

5 m CD O 0) 

f lill 

3 CD _ 0*9 



void svdvar (float **v, int ma, float wO , float **cvm) 

To evaluate the covariance matrix cvm[l . .ma] Cl . - ma] of the fit for ma parameters obtained 

by svdfit, call this routine with matrices v[l. .ma] [1. .ma], v[l. .ma] as returned from | |S g| 

svdf it. oc|?^ 

int k.j.i; "31^5 

float sum, *vti; §S®oO 

o 3 3 b > 
o»3. 2. 3 *~ 

wti=*vector(l ,ma) ; — c cr 3 

for (i=l;i<=ma;i++) < ^ » S> J 1 O 

wti[i]=0.0; §^-^rn 

if CwCi]) vrti[i]=1.0/(w[i]*wCi]); ~ § 2 f C/> 

> fill 5 
for (i*»l;i<=ma;i++) { Sum contributions to covariance matrix (15.4.20). 5 o S j? ^ 

for Cj=l;j<-i;j++) { 3^ 5 u S 

for (sum=0.0 > k=l;k<=ma;k++) sum vCi] [k]*v[j] [k]*wti [k] ; . =• £13 S £ 

c vm [j ] [i] =cvm Ci] C j 3 -sum ; 8 "g P § 

> ■ . jf 22 3 
free_vector(vti,l f ma) ; 3 3* 3 o 

> lliog 

> IK If S 
Examples Isl?f 



j o. zj co ci 

i ffi M CD ^ 

Be aware that some apparently nonlinear problems can be expressed so that § z g 2 z 

they are linear! For example, an exponential model with two parameters a and fc, % | ^ 1 - 

y(x) = aexp(-&a;) (15.4,21) jftffs 

can be rewritten as I ? 3 I fS 

log[y(x)] = c - bx (15.4.22) ||| 1 1 

which is linear in its parameters c and 6. (Of course you must be aware that such J § § g <P 

transformations do not exactly take Gaussian errors into Gaussian errors.) |g"og w 

Also watch out for "non-parameters," as in ! Q § 

y[x) = aexp(-kt + <f) (15.4.23) f o3 | 

Here the parameters o and d are, in fact, indistinguishable. This is a good example of >k« 

where the normal equations will be exactly singular, and where SVD will find a zero §. % 
singular value. SVD will then make a "least-squares" choice for setting a balance 
between a and d (or, rather, their equivalents in the linear model derived by taking the 
logarithms). However — and this is true whenever SVD gives back a zero singular 
value — you are better advised to figure out analytically where the degeneracy is 
among your basis functions, and then make appropriate deletions in the basis set. 
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Here are two examples for user-supplied routines f uncs. The first one is trivial 
and fits a general polynomial to a set of data: 
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void f poly (float x, float pQ , int np) 

Fitting routine for a polynomial of degree np-1, with coefficients in the array p[l. .np] . 

-C 

int j; 
p[l]«1.0; 

for (j=2;j<=np;j++) pCj]=p[j-l]*x; 

S|i.ii 

The second example is slightly less trivial. It is used to fit Legendre polynomials § ® If 2 -S 
up to some order nl-1 through a data set. ^ f 

SO: co =T 

void fleg(float x, float plO , int nl) |||S| 
Rtting routine for an expansion with nl Legendre polynomials pi, evaluated using the recurrence 2 5" °- to c 

relation as in §5.5. 

int j; S|lS>2 

float twox,f2,f l,d; » — £ 3 j~ 

"V» s § £ m 
•ij pa cd Q> o 

pl[l>1.0; fe^SS^ 
pl[2]=x ' " * 

if (nl > 2) { if | ^ 

twox=2 . 0*x ; ^ o ®^ §L O 

f2=x; 
d=1.0; 



iV 

* 2 = £ 



. — . CD _ 13 CO 



for (j=3;j<=nl;j++) { « ?| S > 

f2 += twox; ° 
pl[j>(f2*pl[j-lJ-fl*pl[j-2])/d; Sj^lS 

> afgog 
Multidimensional Fits g°i-§§ 

c ^ c _^ Z 
21 eg S Q 

3 CO ^ 

If you are measuring a single variable y as a function of more than one variable 
— say, a vector of variables x, then your basis functions will be functions of a vector, S - ^ f o 

Xi(x), . . . , The x 2 merit function is now | ® s § & 

S o o" i> 

- ° <9 co ae. 2 

CD c — 

b « Q. 33 o 

(15.4.24) Slffg 
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c o o 
|oo CO 

All of the preceding discussion goes through unchanged, with x replaced by x. In ® f 

fact, if you are willing to tolerate a bit of programming hack, you can use the above E S;f 3 

programs without any modification: In both If it and svdf it, the only use made S^d* 
of the array elements x [i] is that each element is in turn passed to the user-supplied ; ^ ° 

routine f un.cs, which duly gives back the values of the basis functions at that point. 
If you set x [i] =i before calling If it or svdf it, and independently provide f uncs 
with the true vector values of your data points (e.g., in global variables), then f uncs 
can translate from the fictitious x [i] 's to the actual data points before doing its work. 
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15.5 Nonlinear Models 1 1 all 

< 3 § ~ » 
3 <D — OS 
n m n w O 
h a ^» 

We now consider fitting when the model depends nonlinearly on the set of M 3 a 1 § | 

unknown parameters ak,& = 1,2,..., M. We use the same approach as in previous g J* 2. g c 

^1 = ^3 



sections, namely to define a x 2 merit function and determine best-fit parameters 

by its irdnimization. With nonlinear dependences, however, the minimization must g o | o p 

proceed iteratively. Given trial values for the parameters, we develop a procedure ^ 

that improves the trial solution. The procedure is then repeated until x 2 stops (or S 1 1 J 1 o 

effectively stops) decreasing. w « 5 c m 

How is this problem different from the general nonlinear function minimization f 1 1 f 2 

problem already dealt with in Chapter 10? Superficially, not at all: Sufficiently | g f\| P 

close to the minimum, we expect the x 2 function to be well approximated by a l| 1 3 m 

quadratic form, which we can write as g? •§ p > 

x 2 (a)« 7 -da + iaDa (15.5.1) <tjf§8 

l|ff I 

where d is an M- vector and D is an M x M matrix. (Compare equation 10.6.1.) 1 i § J- ^ 

If the approximation is a good one, we know how to jump from the current trial == Pj = o 

parameters a cur to the minimizing ones a min in a single leap, namely ^ 2 § "§ ? 

a rain = ac Ur + D" 1 • [-V X 2 (a cur )] (15.5.2) 1 1§ 5 q 

(Compare equation 10.7.4.) ® eLj^z^ 

On the other hand, (15.5.1) might be a poor local approximation to the shape I f J I § 

of the function that we are trying to minimize at a cur . In that case, about all we J:"S 3 |X 

can do is take a step down the gradient, as in the steepest descent method (§10.6). bg|jg 

In other words, ~&?%& 

a ne xt = a cur - constant x Vx 2 (a cur ) (15.5.3) • §| f 

3. ^3 3 

where the constant is small enough not to exhaust the downhill direction. >'< <□ 

To use (15.5.2) or (15.5.3), we must be able to compute the gradient of the x 2 | f 3 
function at any set of parameters a. To use (15.5.2) we also need the matrix D, which 
is the second derivative matrix (Hessian matrix) of the x 2 merit function, at any a. 

Now, this is the crucial difference from Chapter 10: There, we had no way of 
directiy evaluating the Hessian matrix. We were given only the ability to evaluate 
the function to be niinimized and (in some cases) its gradient. Therefore, we had 
to resort to iterative methods not just because our function was nonlinear, but also 
in order to build up information about the Hessian matrix. Sections 10.7 and 10.6 
concerned themselves with two different techniques for building up this information. 
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